Distributed Stochastic Optimization with Large Delays
Authors
Abstract
The recent surge of breakthroughs in machine learning and artificial intelligence has sparked renewed interest in large-scale stochastic optimization problems that are universally considered hard. One of the most widely used methods for solving such problems is distributed asynchronous stochastic gradient descent (DASGD), a family of algorithms that result from parallelizing stochastic gradient descent on distributed computing architectures (possibly) asynchronously. However, a key obstacle to the efficient implementation of DASGD is the issue of delays: when a computing node contributes a gradient update, the global model parameter may have already been updated by other nodes several times over, thereby rendering this gradient information stale. These delays can quickly add up if the computational throughput of a node is saturated, so convergence may be compromised in the presence of large delays. Our first contribution is that, by carefully tuning the algorithm's step size, convergence to the critical set is still achieved in mean square, even when the delays grow unbounded at a polynomial rate. We also establish finer results for a broad class of structured problems (called variationally coherent), where we show that DASGD converges to a global optimum with probability one under the same delay assumptions. Together, these results contribute to the broad landscape of large-scale nonconvex stochastic optimization by offering state-of-the-art theoretical guarantees and providing insights for algorithm design.
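The abstract describes DASGD-style updates built from stale gradients together with a carefully tuned, decaying step size. The following is a minimal Python sketch of that interplay under illustrative assumptions only: a toy quadratic objective, a simulated delay that can grow polynomially with the iteration counter, and a step-size exponent chosen for the example rather than taken from the paper's analysis. It is not the authors' implementation or their exact conditions.

```python
# Minimal sketch (illustrative assumptions, not the paper's algorithm or conditions):
# each update uses a gradient evaluated at a stale copy of the parameters, and the
# step size decays fast enough to absorb delays that grow polynomially in t.
import numpy as np

rng = np.random.default_rng(0)
dim = 10
x = rng.standard_normal(dim)      # global model parameter
history = [x.copy()]              # past iterates, used to look up stale parameters

def stochastic_grad(z):
    """Noisy gradient of a stand-in objective f(z) = 0.5 * ||z||^2."""
    return z + 0.1 * rng.standard_normal(z.shape)

T = 5000
for t in range(1, T + 1):
    # Simulated delay that can grow like t**0.4; the update is computed at the
    # parameter value from `delay` iterations ago, i.e. a stale gradient.
    delay = rng.integers(0, int(t ** 0.4) + 1)
    stale_x = history[max(0, len(history) - 1 - delay)]

    # Decaying step size, e.g. alpha_t ~ 1 / t**0.9 (an illustrative choice).
    alpha = 1.0 / t ** 0.9
    x = x - alpha * stochastic_grad(stale_x)
    history.append(x.copy())

print("final distance to optimum:", np.linalg.norm(x))
```

The only point of the sketch is the interaction between the delay sequence and the step-size schedule; the precise growth conditions and convergence guarantees are those stated in the article itself.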
Similar resources
Distributed Asynchronous Algorithms with Stochastic Delays for Constrained Optimization Problems with Conditions of Time Drift
A distributed asynchronous algorithm for minimizing a function with a nonstationary minimum over a constraint set is considered. The communication delays among the processors are assumed to be stochastic with Markovian character. Conditions which guarantee the mean square and almost sure convergence to the sought solution are presented. We also present an optimal routing application for a netwo...
Stochastic Nonconvex Optimization with Large Minibatches
We study stochastic optimization of nonconvex loss functions, which are typical objectives for training neural networks. We propose stochastic approximation algorithms which optimize a series of regularized, nonlinearized losses on large minibatches of samples, using only first-order gradient information. Our algorithms provably converge to an approximate critical point of the expected objectiv...
Stochastic Optimization of Querying Distributed Databases II. Solving Stochastic Optimization
General stochastic query optimization (GSQO) problem for multiple join — join of p relations which are stored at p different sites — is presented. GSQO problem leads to a special kind of nonlinear programming problem (P ). Problem (P ) is solved by using a constructive method. A sequence converging to the solution of the optimization problem is built. Two algorithms for solving optimization pro...
Network Location Problem with Stochastic and Uniformly Distributed Demands
This paper investigates the network location problem for single-server facilities that are subject to congestion. In each network edge, customers are uniformly distributed along the edge and their requests for service are assumed to be generated according to a Poisson process. A number of facilities are to be selected from a number of candidate sites and a single server is located at each facil...
Memory and Communication Efficient Distributed Stochastic Optimization with Minibatch Prox
We present and analyze statistically optimal, communication and memory efficient distributed stochastic optimization algorithms with near-linear speedups (up to log-factors). This improves over prior work which includes methods with near-linear speedups but polynomial communication requirements (accelerated minibatch SGD) and communication efficient methods which do not exhibit any runtime spee...
Journal
Journal title: Mathematics of Operations Research
Year: 2022
ISSN: 0364-765X, 1526-5471
DOI: https://doi.org/10.1287/moor.2021.1200